Artificial Error Generation with Machine Translation and Syntactic Patterns
Abstract
Shortage of available training data is holding back progress in the area of automated error detection. This paper investigates two alternative methods for artificially generating writing errors, in order to create additional resources. We propose treating error generation as a machine translation task, where grammatically correct text is translated to contain errors. In addition, we explore a system for extracting textual patterns from an annotated corpus, which can then be used to insert errors into grammatically correct sentences. Our experiments show that the inclusion of artificially generated errors significantly improves error detection accuracy on both FCE and CoNLL 2014 datasets.
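The pattern-based idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: it assumes a much simpler setting in which patterns are plain token substitutions collected from aligned (correct, erroneous) sentence pairs of equal length. The function names `extract_patterns` and `insert_errors`, the `prob` parameter, and the example sentences are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def extract_patterns(pairs):
    """Count (correct_token, error_token) substitutions.

    Assumption: each pair is (correct_sentence, erroneous_sentence) and
    only pairs with the same number of tokens are used, so every error
    is treated as a one-to-one token substitution.
    """
    patterns = Counter()
    for correct, erroneous in pairs:
        c_toks, e_toks = correct.split(), erroneous.split()
        if len(c_toks) != len(e_toks):
            continue  # skip insertions/deletions in this simplified sketch
        for c, e in zip(c_toks, e_toks):
            if c != e:
                patterns[(c, e)] += 1
    return patterns


def insert_errors(sentence, patterns, rng, prob=0.5):
    """Apply extracted substitutions to a correct sentence.

    Returns the corrupted sentence and per-token labels
    (1 = artificially generated error, 0 = unchanged).
    """
    # Group candidate error forms by the correct token they replace.
    by_source = {}
    for (c, e), count in patterns.items():
        by_source.setdefault(c, []).append((e, count))

    out, labels = [], []
    for tok in sentence.split():
        options = by_source.get(tok)
        if options and rng.random() < prob:
            errors, weights = zip(*options)
            # Sample an error form in proportion to its corpus frequency.
            out.append(rng.choices(errors, weights=weights, k=1)[0])
            labels.append(1)
        else:
            out.append(tok)
            labels.append(0)
    return " ".join(out), labels


if __name__ == "__main__":
    annotated = [("I have a cat", "I has a cat")]  # toy annotated corpus
    pats = extract_patterns(annotated)
    print(insert_errors("They have a dog", pats, random.Random(0), prob=1.0))
```

The token-level labels returned alongside the corrupted sentence are the kind of supervision an error detection model could train on; a fuller system would presumably condition patterns on surrounding context rather than on single tokens.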
Similar resources
A Study of Translation Problems of Tourism Industry Guidebooks: An Error Analysis Perspective
This study was motivated by the researchers’ goal to unfold the quality of the English translations of Persian tourism industry texts and discover the most frequent error patterns the Iranian non-native translators have committed in such texts. Thus, the following research questions were addressed: 1) Are the English versions of Persian tourist guidebooks and multimedia compact discs provided b...
Artificial error generation for translation-based grammatical error correction
Automated grammatical error correction for language learners has attracted a lot of attention in recent years, especially after a number of shared tasks that have encouraged research in the area. Treating the problem as a translation task from 'incorrect' into 'correct' English using statistical machine translation has emerged as a state-of-the-art approach but it requires vast amounts of corre...
Topicalization in English Translation of the Holy Quran: A Comparative Study
The Holy Quran, as an Arabic masterpiece, comprises great domains of syntactical, phonological, and semantic literary patterns. These patterns work as the shackle of translators. This study examined the application of the most common shift strategies in Catford's linguistic model for translation of topicalization in chapter 29 of the Holy Quran. The topicalized cases were compared to their coun...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...